Recursiveness, Switching, and Fluctuations in a Replicating Catalytic Network 
A protocell model consisting of mutually catalyzing molecules is studied in order to investigate
how chemical compositions are transferred recursively through cell divisions under replication errors. 
Depending on the path rate, the numbers of molecules and species, three phases are found: fast 
switching state without recursive production, recursive production, and itinerancy between the 
above two states. The number distributions of the molecules in the recursive states are shown to be 
log-normal except for those species that form a core hypercycle, and are explained with the help of 
a heuristic argument. 

PACS numbers: 



In a cell, a huge number of chemicals is synthesized by 
mutual catalyzation leading to replication of molecules 
that allow a cell to grow until it is large enough to divide 
into two. How the underlying reaction networks give rise 
to the recursive production of cells is an important question,
not only when considering the origin of life [?],
but also when trying to understand the general features 
of a modern cell's biochemical reaction dynamics. 

As a simple prototype of a reproducing cell, let us con- 
sider a set of chemicals with some catalytic activities. 
How can such a system consisting of chemicals connected 
by a catalytic reaction network sustain recursive produc- 
tion? Are there any generic properties in the dynamics 
and fluctuations of such reproducing systems? 

These questions were originally addressed in connec- 
tion with the origin of life. Eigen and Schuster pro- 
posed the hypercycle as a mechanism to overcome an 
inevitable loss in the catalytic activities through mutations,
while Dyson [?] argued that it is possible for a collection
of chemicals to be sustained by mutual catalytic
activity. Although the hypercycle itself may be suscepti- 
ble to destruction by parasitic molecules, i.e., molecules 
which are catalyzed by the hypercycle species but them- 
selves do not catalyze other molecules, it was later shown 
that compartmentalization by a cell structure or localized 
patterns in reaction-diffusion systems may suppress the 
invasion of parasitic molecules [?, ?].

Here, we study a simple model of mutually catalyzing 
molecules and classify the biochemical states according 
to their ability for recursive reproduction. Besides fixed 
recursive states, we find fast switching states and sev- 
eral quasi-recursive states that allow for both recursive 
reproduction and evolution. Last, we study the char- 
acteristics of the number distributions of the molecular 
species in these replicating cells. 

We envision a (proto)cell containing k molecular 
species with some of the species possibly having a zero 
population. A chemical species can catalyze the synthesis 
of some other chemical species as
$[i] + [j] \to [i] + 2[j]$,
with $i, j = 1, \dots, k$, according to a randomly chosen reaction
network (with a connection rate of the catalytic
path given by p) which is kept fixed throughout each 
simulation. Furthermore, each molecular species i has a 
randomly chosen catalytic ability $c_i \in [0,1]$ (i.e., the
above reaction occurs with the rate $c_i$). Assuming an
environment with an ample supply of chemicals available 
to the cell, the molecules then replicate leading to an 
increase in their numbers within a cell. It is the dynamics
of these molecule numbers $N_i$ of the species i under
replication that are our main concern here.

During the replication process, structural changes, e.g.
the alteration of a sequence in a polymer, may occur
that alter the catalytic activities of the molecules. The
rate of such structural changes is given by the replication
'error rate' $\mu$. As the simplest case, we assume that this
'error' leads to all other molecule species with equal probability
(i.e., with the rate $\mu/(k-1)$). In reality, of course,
even after a structural change, the replicated molecule 
will keep some similarity with the original molecule, and 
this equal rate of transition to other molecule species is a 
drastic simplification. We therefore also carried out some
simulations where the errors in replication only lead to a 
limited range of molecule species and found that the sim- 
plification does not affect the basic conclusions presented 
here. 

The model is simulated as follows: At each step, a pair 
of molecules, say, i and j, is chosen randomly. If there is a 
reaction path between species i and j, and i (j) catalyzes 
j (i), one molecule of the species j (i) is added with probability
$c_i$ ($c_j$), respectively. The added molecule is then changed
to another randomly chosen species with probability $\mu$,
the replication error rate. When the total number of
molecules exceeds a given threshold (denoted as N), the 
cell divides into two such that each daughter cell inherits 
half (N/2) of the molecules of the mother cell, chosen 
randomly. In order to take the importance of the discreteness
in the molecule numbers [?] into account, we
adopted a stochastic rather than the usual differential 
equations approach. 
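
The update rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for the results reported here: the function names (make_network, reaction_step, simulate) are hypothetical, self-catalysis is excluded, a random ordered pair (catalyst, template) stands in for the pair i and j, the initial composition is uniformly random, and the cell divides as soon as it reaches N molecules.

```python
import random

def make_network(k, p, rng):
    """Random directed network: catalyzes[i] holds every species j whose
    replication is catalyzed by i. Each path exists with probability p.
    Assumption: no self-catalysis (i != j)."""
    return [{j for j in range(k) if j != i and rng.random() < p}
            for i in range(k)]

def reaction_step(cell, catalyzes, c, mu, k, rng):
    """Pick two distinct molecules; if the first catalyzes the second,
    add a copy of the second with probability c[i]. Returns True on growth."""
    x, y = rng.sample(range(len(cell)), 2)
    i, j = cell[x], cell[y]
    if j not in catalyzes[i] or rng.random() >= c[i]:
        return False
    new = j
    if rng.random() < mu:
        # Replication error: uniform over the k - 1 species other than j.
        new = rng.randrange(k - 1)
        if new >= j:
            new += 1
    cell.append(new)
    return True

def simulate(k=20, p=0.2, mu=0.01, N=200, divisions=5, seed=0,
             max_steps=200_000):
    """Return the molecule-number vector recorded at each division event."""
    rng = random.Random(seed)
    catalyzes = make_network(k, p, rng)
    c = [rng.random() for _ in range(k)]              # catalytic abilities in [0, 1)
    cell = [rng.randrange(k) for _ in range(N // 2)]  # assumed initial state
    history = []
    for _ in range(max_steps):
        if len(history) >= divisions:
            break
        reaction_step(cell, catalyzes, c, mu, k, rng)
        if len(cell) >= N:
            counts = [0] * k
            for s in cell:
                counts[s] += 1
            history.append(counts)
            # Division: one daughter keeps N/2 randomly chosen molecules.
            rng.shuffle(cell)
            del cell[N // 2:]
    return history
```

Calling simulate() returns one count vector per division event. The default parameters are far smaller than those used in the figures and serve only to keep the sketch fast.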

The cell state at the n-th division is characterized
by the molecule numbers of the chemical species
$\{N_1^n, N_2^n, \dots, N_k^n\}$ (with $\sum_j N_j^n = N$), while there are
four basic parameters: N, k, $\mu$, and p. By investigating
the dynamics of one thousand randomly chosen networks, 
and changing the four parameters, we have found that 
the behaviors of the system can be classified into just the 
following three types: 

(A) Fast switching states without recursiveness 

(B) Fixed recursive states 

(C) Itinerancy over several quasi-recursive states 

In phase (A), even though each generation has some
dominating species with regard to the molecule numbers,
the dominating species change every few generations,
and information regarding the previously dominating
species is totally lost, often to the point that its population
drops to zero. Indeed, by autocatalytic reactions,
the population of one dominant species can be amplified,
but soon it is replaced by another chemical that is
catalyzed by it (see Fig. 1a).

In phase (B), a recursive state is established where the 
chemical composition is stable enough to withstand the 
division process. Once reached, this state lasts very long
(e.g., for as long as the simulation lasts, say $10^4$ generations;
see Fig. 1b).

The recursive state ('attractor') here is not necessar- 
ily a fixed point (with fluctuations) since the molecule 
numbers may oscillate in time. Nevertheless, the overall 
chemical compositions remain within certain ranges: for 
example, the major species (i.e. those that are synthe- 
sized by themselves, not by an error in the replication 
process) are not altered over the generations. Generally,
all the observed recursive states consist of 5-12 species,
except for those species which exist only as a result of
replication errors (see also [?] for recursive transmission
of state in a network model with some structure).

For example, in the recursive state depicted in Fig. 1b,
there are 11 species whose populations remain in existence
throughout the simulation. As is shown in Fig. 2,
the replication of the molecules is sustained by the 'core
hypercycle' $109 \to 11 \to 13 \to 109$, where the catalytic
activities of these core species satisfy $c_{13} > c_{109} > c_{11}$,
and accordingly we have for the respective populations
$N_{11} > N_{109} > N_{13}$. This relationship is natural, since
molecules with higher catalytic activities result in the
synthesis of more molecules other than themselves, thus
suppressing their own population fraction.

Here, through mutual catalyzation, molecules with 
higher catalytic activities are catalyzed by molecules with 
lower activities but larger populations. To destroy such a 
network of mutual support, large fluctuations in molecule 
numbers are required, which are rare for large N. Hence 
parasitic molecule species cannot easily invade the core. 

In phase (C), the system alternates between quasi- 
recursive states similar to phase (B) that last for many 
generations and fast switching states similar to phase 
(A). The quasi-recursive state itself can be subjected
to switches between core hypercycles, as can be seen
in Fig. 1c, where a switch occurs from an initial core
hypercycle (109, 11, 13) to the next core hypercycle
(11, 13, 195, 155) around the 8500th generation. Subsequently,
around the 12000th generation, the core network
is taken over by parasites and enters a phase-(A)-like fast
switching state, which in turn gives way to a new
quasi-recursive state around the 14000th generation.

FIG. 1: The number of molecules $N_i^n$ for the species i is plotted
as a function of the generation n, i.e., at each successive
division event n. p = 0.1 and N = 64000. (a) $\mu$ = 0.01
and k = 500; (b) $\mu$ = 0.01 and k = 200; (c) $\mu$ = 0.1 and
k = 200. For (b) and (c), the same network is adopted. Different
colors correspond to different species, and only those
species whose population becomes large during some generation
are plotted.

FIG. 2: The catalytic network of the species that constitute
the recursive state

When N is not so large, the molecule number of the 
species with the highest catalytic activity in the core hypercycle
can become small due to fluctuations, and subsequently
succumb to parasitic molecules. Then, the core
hypercycle loses its main catalyst, resulting in its collapse
and giving way to a fast switching state that in turn will allow
the formation of a new core hypercycle (which can,
but does not have to, be identical to the previous one).

Which one of the phases (A), (B), (C) appears, of
course, depends on the parameters and the specific structure
of the network. There is, however, a clear dependence
of the fraction of the networks leading to each
of these phases on the parameter values. The fraction
of (B) increases and the fraction of (A) decreases
for increasing N, or for decreasing k, $\mu$, or p. For a
more systematic investigation, it is useful to classify the 
phases by the similarity of the chemical compositions between
two cell division events [?]. This can be done by
defining a k-dimensional vector $\vec{V}_n = (p_n(1), \dots, p_n(k))$ with
$p_n(i) = N_n(i)/N$ and measuring the similarity between
generations separated by $\ell$ divisions with the help of the inner product
$H_\ell = \vec{V}_n \cdot \vec{V}_{n+\ell} / (|\vec{V}_n| |\vec{V}_{n+\ell}|)$. In Fig. 3, the average
similarity $H_{20}$ and the average division time are plotted
for 50 randomly chosen reaction networks as a function
of the path rate p. For p > 0.2, phase (A) is observed
for nearly all the networks (e.g., 48/50), while for
lower path rates, the fraction of (C) (with a few (B)) increases.
(Roughly speaking, the networks with $H_{20} > 0.9$
belong to (C), and those with $H_{20} < 0.4$ to (A).)
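
The similarity measure defined above is the cosine of the angle between two composition vectors, so the common factor 1/N cancels and raw molecule numbers can be used directly. A minimal sketch follows, assuming the cell states are stored as one list of molecule numbers per division event. The function names and the convention of returning 0 for an empty vector are illustrative choices, not part of the model.

```python
import math

def similarity(counts_a, counts_b):
    """Normalized inner product of two composition vectors."""
    dot = sum(a * b for a, b in zip(counts_a, counts_b))
    norm_a = math.sqrt(sum(a * a for a in counts_a))
    norm_b = math.sqrt(sum(b * b for b in counts_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for an empty cell
    return dot / (norm_a * norm_b)

def average_similarity(history, lag):
    """Average of the similarity between generations n and n + lag."""
    values = [similarity(history[n], history[n + lag])
              for n in range(len(history) - lag)]
    if not values:
        raise ValueError("history is shorter than the requested lag")
    return sum(values) / len(values)
```

For example, average_similarity(history, 20) gives the quantity plotted as the average similarity for lag 20, provided history covers more than 20 division events.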

In general, we have found a positive correlation between
the growth speed of a cell, the similarity H, and the
diversity of the molecules. (In Fig. 3, networks with larger
H have smaller division times.) The recursive states, established
by a variety of species, maintain higher growth
speeds since they effectively suppress parasitic molecules. 
In Fig. 3, for decreasing path rates, the variations in the 
division speeds of the networks become larger, and some 
networks that reach recursive states have higher division 
speeds than networks with larger p. On the other hand, 
when the path rate is too low, the protocells generally 
cannot grow, since the probability of having useful connections
in the network is nearly zero. Indeed, an optimal
path rate seems to exist (e.g., around 0.05 for k = 200,
N = 12800 as in Fig. 3) for which some networks have 
rather high growth speeds. Consequently, in an envi- 
ronment that necessitates competition for growth, proto- 
cells having such optimal networks will be more successful 
than protocells with sub-optimal networks. 

Finally, we investigate the fluctuations of the molecule 
numbers of each of the species. Since the number of 
molecules is not very large, the fluctuations over the gen- 
erations can possibly have a significant impact on the 
dynamics of the system. In order to quantify the sizes of
these fluctuations, we have measured the distribution $P(N_i)$
for each molecule species i, by sampling over division 
events. Our numerical results are summarized as follows: 

FIG. 3: The average similarity $H_{20}$ (red) and the average
division time (blue) are plotted as a function of the path rate
p. For each p, data from 50 randomly chosen networks are
plotted. The average is taken over 600 division events. The
green line indicates the average of $H_{20}$ over the 50 networks
for each p. At p = 0.02, 25 out of 50 networks cannot support
cell growth; at p = 0.04, 4 cannot.



(I) For the fast switching states, the distribution $P(N_i)$
satisfies the power law $P(N_i) \approx N_i^{-\alpha}$, with $\alpha \approx 2$.

(II) For recursive states, the fluctuations in the core 
network (i.e., 13,11,109 in Fig. 2) are typically small. On 
the other hand, species that are peripheral to but catalyzed
by the core hypercycle have log-normal distributions
$P(N_i) \approx \exp\left(-\frac{(\log N_i - \overline{\log N_i})^2}{2\sigma^2}\right)$, as shown in Fig. 4.

We have also plotted the variance $\overline{(N_i - \overline{N_i})^2}$ (where the overline denotes the
average over the distribution $P(N_i)$) and the deviation between
the peak of $P(N_i)$ and $\overline{N_i}$, each divided by the average
$\overline{N_i}$ (Fig. 5). As can be seen, the variance and the fluctuations
in the core network are small, especially for the minority 
species (i.e., 13). For molecule species that do not belong 
to the core hypercycle, the variance scaled by the average 
increases as the average decreases. Furthermore, there is 
a distinct deviation between the peak and the average 
(except for the core species), since the distribution has 
a tail for larger sizes. On the other hand, if we use the 
variable $\log N_i$ when plotting the distribution, it is closer
to a Gaussian, and the difference between the peak and 
the average is suppressed. 
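
The two scaled quantities discussed above can be computed from the molecule numbers of one species sampled at successive division events. The sketch below is illustrative only: the function name is hypothetical, the peak is taken as the most frequent integer count (binned histograms would be needed for sparse data), and the average is assumed to be positive.

```python
from collections import Counter

def fluctuation_stats(samples):
    """Scaled variance and scaled peak deviation for one species.

    samples: integer molecule numbers, one per division event.
    Assumes a positive average.
    """
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    # Peak of the distribution: the most frequently observed count.
    peak = Counter(samples).most_common(1)[0][0]
    return {
        "mean": mean,
        "scaled_variance": variance / mean,
        "scaled_peak_deviation": (mean - peak) / mean,
    }
```

For the samples [2, 2, 2, 4, 10], the average is 4, the variance divided by the average is 2.4, and the peak at 2 lies below the average by half of the average, which is the signature of a tail towards larger sizes.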

The origin of the log-normal distributions here can be 
understood by the following rough argument: for a replicating
system, the growth of the molecule number $N_m$
of the species m is given by $dN_m/dt = A N_m$, where A is
the average effect of all the molecules that catalyze m.
We can then obtain the estimate $d\log N_m/dt = a + \eta(t)$
by replacing A with its temporal average a plus fluctuations
$\eta(t)$ around it. If $\eta(t)$ is approximated by
Gaussian noise, the log-normal distribution for $P(N_m)$
is suggested (this argument is valid if a > 0). For the
fast switching state, the growth of each molecule species is
close to zero on average, and in this case, by considering
the Langevin equation with boundary conditions,
the power law follows as discussed in [?].
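
The rough argument for the log-normal case can be checked numerically. The sketch below integrates the logarithm of the molecule number with a constant average rate plus independent Gaussian noise at each unit time step, and compares the skewness of the resulting numbers with that of their logarithms. The parameter values and function names are arbitrary illustrative choices, and division and the discreteness of the molecule numbers are ignored.

```python
import math
import random

def simulate_log_growth(a, noise_sd, steps, samples, seed=0):
    """Unit-time Euler steps of d(log N)/dt = a + eta(t); returns final N."""
    rng = random.Random(seed)
    finals = []
    for _ in range(samples):
        log_n = 0.0
        for _ in range(steps):
            log_n += a + rng.gauss(0.0, noise_sd)
        finals.append(math.exp(log_n))
    return finals

def skewness(xs):
    """Third standardized moment (0 for a symmetric distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    third = sum((x - mean) ** 3 for x in xs) / n
    return third / var ** 1.5

values = simulate_log_growth(a=0.01, noise_sd=0.1, steps=100, samples=5000)
logs = [math.log(v) for v in values]
print("skewness of N:    ", skewness(values))
print("skewness of log N:", skewness(logs))
```

The numbers themselves show a strongly positive skewness (a tail towards large sizes), while their logarithms are nearly symmetric, as expected if the distribution is close to log-normal.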

If several molecules mutually catalyze each other, due
to the central limit theorem, one would expect their distributions
to be close to Gaussian, and this is indeed the
case for the three core species.

FIG. 4: The number distribution of the molecules corresponding
to the network in Fig. 2. The distribution is sampled from
1000 division events. From right to left, the plotted species
are 11, 109, 13, 155, 176, 181, 195, 196, 23. Log-log plot.

FIG. 5: Scaled variance and deviation. × denotes the variance
of the molecule number divided by its average, while +
shows the difference between the average and the peak of the
distribution divided by the average $\overline{N_i}$. From right to left,
the species 11, 109, 13, 155, 194, 176, 195, 181, 196, 23, 34 are plotted,
i.e., the largest $\overline{N_i}$ corresponds to species 11, the third
largest to 13, and so forth. Computed in the same way as
Fig. 4.

By studying a variety of networks, the observed distributions
of the molecule numbers can be generally summarized
as: (1) distributions close to Gaussian form, with
relatively small variances, in the core (hypercycle) of the
network; (2) distributions close to log-normal, with larger
fluctuations, for the peripheral parts of the network; and (3)
power-law distributions for parasitic molecules that appear
intermittently.

To sum up, the behavior of a protocell with catalytic reactions
and divisions is classified into three phases. Of particular
note are the recursive states (B) and the switching over
quasi-recursive states (C), which maintain the catalytic activities
needed for cell reproduction. Besides the establishment of recursive
growth, variability of cells in their chemical compositions
is necessary in order to ensure evolvability. Previously,
we pointed out the relevance of minority molecules in
mutually catalyzing systems for making recursive growth
and evolution compatible [?]. Indeed, phase (C) satisfies
both of these features, since novel quasi-recursive states
with different chemical compositions are visited successively,
triggered by extinctions of minority molecules in
the core hypercycle networks.

We showed the suppression of fluctuations in the molecule numbers
of a core hypercycle network and the ubiquity of log-normal
distributions in the peripheral network, which can
be tested in present-day cells using recent advances
in quantitative measurements of fluctuations. Indeed,
it is interesting to note that the distributions
of the abundances of fluorescent proteins, measured
by flow cytometry, are often closer to log-normal than
Gaussian [?]. Furthermore, e.g., the size of bacteria [14]
and of some cells in blood [?] (as well as human body
weight) obey log-normal distributions.

I would like to thank C. Furusawa, F. Willeboordse, 
and T. Yomo for useful discussions. This research was 